Global Cache: Installation and Operations Guide
Google Confidential
GGC Partners Only
Contents
1 Installation and Commissioning Process Overview
2 Hardware Installation
2.1 You will need
2.2 Procedure
2.3 Disk layout
2.4 More information
3 Switch Configuration
3.1 You will need
3.2 Procedure
3.3 Switch Configuration Examples
3.3.1 Cisco Switch Configuration Fragment
3.3.2 Juniper Switch Configuration Fragment
4 IP Addressing
4.1 IPv4
4.1.1 Addressing scheme
4.1.2 Server Naming / Reverse DNS
4.2 IPv6
4.2.1 Addressing scheme
4.2.2 IPv6 Enablement
4.3 Proxies and Filters
5 Software Installation
5.1 You will need
5.2 Preparing the USB stick (drive)
5.2.1 Create the USB boot stick on Microsoft Windows
5.2.2 Create the USB boot stick on Mac
5.2.3 Create the USB boot stick on Linux
5.3 GGC Software Installation
5.4 GGC Software Reinstallation
5.5 When things go wrong
6 BGP Configuration
6.1 You will need
6.2 Procedure
6.3 What to Announce Over the Peering Session
6.3.1 User and Resolver Prefixes
6.3.2 Peers and Downstream ASNs
6.4 Multiple Cache Nodes
6.5 BGP Peer Configuration Examples
6.5.1 Cisco Option 1: Prefix list based route filtering
6.5.2 Cisco Option 2: AS-PATH based route filtering
6.5.3 Juniper Option 1: Prefix based policy
6.5.4 Juniper Option 2: AS-PATH based policy
[Figure: Installation and commissioning process overview. Shaded steps are completed by the ISP. Steps shown include: complete contact and shipping information on peering.google.com; equipment arrives.]
2.2 Procedure
Note: Both power supplies must be connected. It is strongly recommended that you connect each power supply
to an independent power feed (i.e., A and B power). However, both can be connected to the same circuit if a
second circuit is not available. This will at least protect from failure of a single power supply.
2.3 Disk layout
In the case where a disk is showing errors, the GGC operations team will contact you and ask you to re-seat or replace the disk. A disk slot number will be provided.
The layout below can help you locate the correct disk.
Note: Disk slots #12 and #13 are on the rear of the chassis.
#0 #9
#1 #10
#2 #11
#3 #6
#4 #7
#5 #8
Fig. 2.1: Dell PowerEdge disk layout
3.2 Procedure
[Figure: GGC machines (GGChost1, GGChost2, ...) connected to the dedicated switch]
Please refer to your switch’s documentation for the specific commands to configure the Ethernet ports facing the
GGC machines as follows:
• 10Gbps full duplex
• Set to auto-negotiate
• Portfast (Cisco IOS), Edge port (JUNOS), or equivalent
• No flow control
• All machines in the GGC node must be in a single, dedicated layer 2 broadcast domain
The following example is for illustration purposes only. Your configuration may vary. Please contact your switch
vendor for detailed configuration support for your specific equipment.
!
interface TenGigabitEthernet1/1
description GGChost1-Te1
switchport mode access
flowcontrol send off
spanning-tree portfast
!
interface TenGigabitEthernet1/2
description GGChost2-Te1
switchport mode access
flowcontrol send off
spanning-tree portfast
!
interface TenGigabitEthernet1/3
description GGChost3-Te1
switchport mode access
flowcontrol send off
spanning-tree portfast
!
interface TenGigabitEthernet1/4
description GGChost4-Te1
switchport mode access
flowcontrol send off
spanning-tree portfast
!
end
Note: For the port descriptions in your configuration, we recommend that you use the GGC hostnames, e.g. mynode-abc101-Te1.
4.1 IPv4
Please configure reverse DNS entries for all servers’ IP addresses (both real and virtual addresses) to
cache.google.com.
The following example is for illustration purposes only (bind configuration):
$TTL 1D
$ORIGIN 10.10.10.in-addr.arpa.
@ IN SOA ...
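Once the reverse zone is in place, a quick check (a sketch only; 10.10.10.10 stands in for one of the server addresses) should return cache.google.com:
$ dig -x 10.10.10.10 +short
cache.google.com.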
4.2 IPv6
Note: The node relies on IPv6 Router Advertisements (RA) for the configuration of the IPv6 default gateway.
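In practice this means your router must send RAs on the GGC VLAN. As a quick check (a sketch only, run from a Linux test host attached to the same VLAN; rdisc6 is part of the ndisc6 package and the interface name is hypothetical), you can solicit an RA and confirm that a default router is being advertised:
$ sudo rdisc6 eth0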
Adding IPv6 support to a GGC node is easy for both new and existing nodes.
For a new node, IPv6 can be enabled prior to installation by specifying an IPv6 subnet and IPv6 Router for
BGP Sessions when you supply the other technical information required for node activation in the ISP Portal.
For an existing node that is already serving IPv4 traffic, IPv6 can also be enabled through the ISP Portal. From the
menu select 'Configure > GGC', select the node on which you wish to enable IPv6 and then choose 'IPv6
Enablement'.
Click the ‘Edit’ button to enter or change the IPv6 subnet of the node and the IPv6 address of the BGP peer.
Note:
• the IPv6 subnet should be entered in CIDR notation, including the ‘/64’
• use of IPv6 link-local addresses for the BGPv6 peer is not allowed
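For illustration only (addresses hypothetical): the IPv6 subnet would be entered as 2001:db8:1::/64, and the BGP peer as a global IPv6 address such as 2001:db8:1::1 (not an fe80:: link-local address).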
4.3 Proxies and Filters
No transparent proxies or filters may be placed in the path of communications between the GGC Node and
Google's back-end servers.
Attention: Each server should have its own USB boot stick, as server specific configurations are stored on
the stick.
Download the install image ggc-setup<<VERSION>>.img. You can find the link for the Setup Image in the
‘Downloads’ section of the ISP Portal (https://isp.google.com/downloads), under ‘Setup Image’.
Once the image is downloaded, the USB boot sticks need to be created. This should be repeated for each USB
stick for each server.
Warning: All data on the USB stick will be erased when you load the GGC image.
5. Use the button in the ‘Image File’ group box to select the downloaded setup image file
(ggc-setup<<VERSION>>.img)
6. Press the ‘Write’ button to write the image to the USB stick and confirm the operation.
7. After a few seconds, a message box should appear, stating that the write operation was successful. The
USB stick can be removed.
5.2.2 Create the USB boot stick on Mac
1. Open a terminal.
2. Insert the USB stick; a new device (e.g., /dev/disk2s1) will appear.
3. Check if a partition on the device is mounted:
$ df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/disk1 112Gi 33Gi 78Gi 30% /
devfs 203Ki 203Ki 0Bi 100% /dev
map auto.auto 0Bi 0Bi 0Bi 100% /auto
map auto.home 0Bi 0Bi 0Bi 100% /home
map -hosts 0Bi 0Bi 0Bi 100% /net
/dev/disk2s1 7.5Gi 1.5Gi 6.0Gi 21% /Volumes/Cruzer
4. You will need to unmount the USB stick as follows:
$ diskutil umount /Volumes/Cruzer
Volume Cruzer on disk2s1 unmounted
5. Verify it is gone:
$ df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/disk1 112Gi 33Gi 78Gi 30% /
devfs 201Ki 201Ki 0Bi 100% /dev
map auto.auto 0Bi 0Bi 0Bi 100% /auto
map auto.home 0Bi 0Bi 0Bi 100% /home
map -hosts 0Bi 0Bi 0Bi 100% /net
6. Provided the USB stick is device /dev/disk2, the following command creates the bootable USB stick:
$ sudo dd if=/path/to/ggc-setup<<VERSION>>.img of=/dev/disk2 bs=1M
It will ask for your password and then copy the image to the USB Stick. After this command has
completed, the USB stick can be removed.
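Optionally, to make sure the device is flushed and cleanly detached before you pull it out (a sketch, assuming the stick is still /dev/disk2), you can eject it first:
$ diskutil eject /dev/disk2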
5.2.3 Create the USB boot stick on Linux
1. Open a terminal.
2. Insert the USB stick; a new device will appear (e.g., /dev/sdb). The device name can be checked using
'dmesg':
$ dmesg
usb 1-4: new high speed USB device using ehci_hcd and address 5
scsi7 : usb-storage 1-4:1.0
scsi 7:0:0:0: Direct-Access Kingston DataTraveler G3 1.00 PQ: 0 ANSI: 0 CCS
sd 7:0:0:0: Attached scsi generic sg2 type 0
sd 7:0:0:0: [sdb] 7567964 512-byte logical blocks: (3.87 GB/3.60 GiB)
sd 7:0:0:0: [sdb] Write Protect is off
sd 7:0:0:0: [sdb] Mode Sense: 0b 00 00 08
sd 7:0:0:0: [sdb] Assuming drive cache: write through
sd 7:0:0:0: [sdb] Assuming drive cache: write through
sdb: sdb1
sd 7:0:0:0: [sdb] Assuming drive cache: write through
sd 7:0:0:0: [sdb] Attached SCSI removable disk
In this particular example, the device is /dev/sdb.
3. Make sure no partition on this device is mounted. The command ‘mount | grep /dev/sdb‘ should
not return any output. If it does, unmount the partition(s). Here is an example:
$ mount | grep /dev/sdb
/dev/sdb1 on /media/DEBIAN_LIVE type vfat
In this particular instance, the partition /dev/sdb1 is mounted. To unmount the partition:
$ sudo umount /dev/sdb1
Now, the command:
$ mount | grep /dev/sdb
returns no output, meaning no partition on this device is mounted.
4. Provided the USB stick is device /dev/sdb, the following command creates the bootable USB stick:
$ sudo dd if=/path/to/ggc-setup<<VERSION>>.img of=/dev/sdb
After this command has completed, the USB stick can be removed.
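Optionally, you can flush the write buffers before removing the stick:
$ sync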
Warning: The GGC software installation will destroy data on the server's disks.
5. Press enter or wait for 10 seconds for the ‘Boot Menu’ to disappear. The system boots up and starts the
installation program.
The installer will examine the hardware configuration. Depending on the current configuration,
modifications to the BIOS and/or RAID controller configuration may be applied. This potentially requires
a reboot of the system. The installation program will restart automatically.
6. The installer will detect which NIC has a link - in this case the first 10GE interface - and will ask you to
confirm if you wish to configure it, as shown in NIC Detection. Press enter to proceed.
7. Enter the IP configuration for this particular server in the IP Information screen (an illustrative example
follows this procedure). The configuration should match the IP information provided to Google in the ISP Portal:
• Enable LACP [N]: if you plan to use LACP, enter ‘Y’. Otherwise just press enter. Please note that
you can enable LACP even when there are not yet multiple network cables connected.
• Enter GGC node subnet in CIDR format (x.x.x.x/nn or x:x:x:x::/64).
• Enter the machine number.
8. Upon validation of the IP information and connectivity, the server will begin the local software installation.
This step will take a couple of minutes. Please be patient and allow it to finish.
9. When the installation process completes successfully, you will see a screen similar to Successful
installation. Press enter to reboot the server.
In case you notice warnings or error messages on the screen, please do not reboot the server (See section
‘When things go wrong‘).
The USB stick should be left in the machine, in case a re-installation is needed (See section ‘GGC
Software Reinstallation‘).
10. When the server reboots after a successful installation, it will boot from disk. The machine is now ready
for remote deployment.
11. Label each server with the machine number and IP address assigned to it.
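For illustration only (all values hypothetical): machine 1 of a node using subnet 192.0.2.0/26 without LACP would be entered as Enable LACP: N (just press enter), GGC node subnet: 192.0.2.0/26, machine number: 1.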
Once the field setup process is complete, the setup program will automatically report the configuration to Google
so that the installation can be remotely completed and the node can be brought on-line.
5.4 GGC Software Reinstallation
In some cases (e.g., when the root disk has been replaced) a server needs to be re-installed. The procedure is very
similar to the one described above; the most important difference is that a manual intervention is required to boot
the server from the USB stick.
1. Connect monitor and keyboard to the server to be installed.
2. The setup USB boot stick should still be in a USB port; it has the previously entered network configuration
stored. If it is not, the USB stick can be recreated as described in section 'Preparing the USB stick (drive)'.
The stored network configuration will be lost and has to be entered again.
3. Start the server. During POST, a menu will appear on the screen, similar to the POST screen.
4. Press F11 to enter the Boot manager. Select the ‘BIOS Boot Menu’.
5. From the list of bootable devices, select ‘Hard disk C:’ and then the USB stick, as shown in the illustration
Boot from USB stick
6. Once the installation program is launched, you can proceed with the same steps as described in section
‘GGC Software Installation‘.
Note that if the same USB stick is used as for the original installation, the network configuration settings
are prefilled with the previously entered information. Provided these data are still valid, you can just press
enter to proceed.
5.5 When things go wrong
• If disks have been replaced or swapped, you may see messages during POST such as "there are missing or
virtual drives with preserved cache". To resolve this issue, please do the following during system boot:
1. Press CTRL+R to enter the RAID Configuration Utility.
2. Select “Controller 0” at the “VD Management” screen
3. Press “F2 Operations” and choose “Manage Preserved Cache”
4. Select “Virtual drive, Missing”, choose “DISCARD CACHE”, choose “YES”
5. Exit
6. Reboot
• When network connectivity cannot be established, please check the cables, switch and router
configuration, and the IP information entered during installation.
• If the setup process encounters an error after network connectivity is established, it will automatically
report the error to Google for investigation. If this happens, please leave the server running with the USB
stick inserted.
• In other cases, please contact the GGC Operations team: [email protected]. Please always include a GGC
node name in all communications with us.
Note: Only a single BGP session is permitted per GGC node. Redundancy is not required, as an interruption of
this session will not impact traffic flow at the node.
6.2 Procedure
• Your end of the session will be the router specified in the ISP Portal. Use your public ASN as provided in
the ISP Portal for your end of the eBGP session.
Note:
– BGP multihop is supported
– IPv6 link-local addresses are not allowed
• The GGC end of the session will be the last usable IP address in the GGC subnet (as in broadcast IP - 1)
for IPv4 and CIDR6::fffe for IPv6 (as in subnet IP + 65534); see the worked example after this list. See
also: IP Addressing. This is a virtual address which will be configured after Google completes the remote
installation.
Note:
– the GGC ASN for new nodes is always 11344
• The session should be configured in passive mode. The connection is always initiated by the GGC end.
Note:
– the session will not come up until Google completes the next step of the installation
• Do not configure monitoring on the BGP session. The GGC system does not interpret an interruption of
the BGP feed as a loss of the node. The node will continue to serve based on the most recent valid feed
received until the session is restored. Google will monitor the availability of the node and automatically
shift traffic away in the event of an outage. Normal management activity may briefly interrupt the session
at any time.
• For configuration simplicity, MD5 passwords are not recommended, but they are supported if required.
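For illustration only (subnets hypothetical): for an IPv4 GGC subnet of 192.0.2.0/26, the broadcast address is 192.0.2.63, so the GGC end of the session is 192.0.2.62; for an IPv6 GGC subnet of 2001:db8:1::/64, the GGC end is 2001:db8:1::fffe.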
Google Global Cache uses BGP only as a mechanism to communicate the prefixes of users that should be served
from a node. It is not used for routing or to determine if the cache is online. An interruption to the BGP session
has no effect on the cache.
In order for the GGC node to perform optimally, both the IP address of the user and the IP address of the
DNS resolver they are using must be advertised to the cache.
While the vast majority of end user traffic is delivered by GGC nodes based on the end user’s IP address alone, a
small subset of requests use the IP address of the DNS resolver being used by the end user.
Optionally, Google also supports EDNS Client Subnet: by implementing Client Subnet Information in DNS
Requests, you can increase the share of requests that are mapped based on the user's actual IP address.
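As a quick check (a sketch only; the exact record contents vary), you can query the special hostname used later in this guide for TXT records. When your resolver forwards Client Subnet information, the response should include an additional TXT record of the form "edns0-client-subnet <prefix>":
$ dig +short TXT o-o.myaddr.l.google.com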
Besides mapping, the user IP addresses are used to build an access control list on the node itself.
Prefixes from other ASNs can be mapped to the cache as well, provided the following conditions are met:
• Both the DNS resolver and user prefixes must be advertised in order to both map and serve those users
from the cache
• If the other AS transits an AS with a peering relationship with Google, their traffic will not be mapped to
the cache. If an exception is required, please contact [email protected].
• Do not send the full Internet routing table to the GGC node. Only send the prefixes that should be served
from the node.
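For illustration only (prefixes hypothetical): if your users are in 198.51.100.0/24 and your resolvers in 203.0.113.0/28, announce both of those prefixes (plus the user and resolver prefixes of any downstream AS that should be served from the node), but do not announce a default route or the full routing table.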
For configuration options and considerations when deploying multiple cache nodes in your network, see the
accompanying document ‘Multi-Node Concepts’ (GGCMultinodeDeployments.pdf).
The following examples are for illustration purposes only. Your configuration may vary. Please contact us if you
require additional support.
policy-statement no-routes {
term default {
then reject;
}
}
The default configuration of these servers is calibrated toward reducing power consumption. This is
accomplished by allowing the servers to run hotter than you might expect from other data center equipment. This
configuration will allow the exhaust temperature to rise to 50°C (122°F), likely 10°C - 15°C (18°F - 27°F) higher
than you might be used to. This is normal.
You may notice the fans spinning more slowly than you might expect. We have instrumentation in place that
collects this data, and have seen fans running at approximately 25% of their rated speed in warm environments,
with no difficulty in keeping the server to its desired temperature.
This change allows you to save power and cooling costs, with no adverse effects on the servers.
In the event you need to shut down or interrupt access to the GGC node(s) for scheduled maintenance of your
data center or network, it is important that you drain traffic away from the node(s) in order to avoid any end-user
impact.
Note: Please do not simply shut down the node without draining traffic: users might experience service
interruptions.
The traffic drain should be scheduled in the ISP Portal (isp.google.com), as follows:
1. From the menu, select ‘Configure > GGC’ (you will see a list of all GGC nodes installed in your network)
2. Expand the node for which you would like to schedule maintenance
3. Select ‘Maintenance’, then ‘Add Maintenance’
4. Enter the Start and End Date/Time (in UTC) - note that traffic will be drained approximately 30 minutes
prior to the scheduled start time
5. Traffic will be restored automatically after the maintenance window ends
If you need to power down the GGC machines as part of your maintenance, you can initiate a graceful shutdown
by pressing the power button on the front of the server once. It can take up to 5 minutes before the system shuts
itself down completely.
Our monitoring systems may detect that machines are unreachable and send you automated email alerts. Please
ignore these for the duration of the maintenance window.
Google’s monitoring system will remotely identify hardware failures. Your technical contact will be notified if
we require any local assistance, troubleshooting, or RMA coordination. If you believe that hardware is not
operating properly, please contact us at [email protected].
Note: Keep the Technical Contacts section of the ISP Portal (under ‘Configure > Network Info’)
(isp.google.com) up to date. The GGC operations team relies on this information in the event you need to be
contacted.
While no monitoring is required, some local monitoring can be helpful. However, it is important to understand
the following considerations:
• It can be helpful to monitor the availability and performance of the path between the GGC node subnet and
Google’s network. A sample host for video content origin is v1.cache1.googlevideo.com.
• If you monitor egress traffic from the node, bear in mind that traffic at the node will be impacted by
youtube.com maintenance and availability.
• Binary and configuration changes are regularly pushed to machines in the node in a rolling fashion. If you
are monitoring egress per machine, you will see occasional interruptions of service during the associated
restarts. The GGC software ensures that the load for the machine under service is spread around the node
during these events.
• The BGP session to the node is different from typical peering sessions. It is not used for routing or to
establish the availability of the node. Brief interruptions of the session are normal and will not impact user
traffic. If you are monitoring this session, you should not consider it an actionable alert unless the session
is down for longer than an hour.
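To check whether a particular video is being served from the cache, resolve the hostname of the video server that the player is using (the hostname below is illustrative; how to find it depends on your browser's developer tools) and look at the address returned, e.g.:
$ nslookup r3---sn-bjvg2-1gie.c.youtube.com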
Non-authoritative answer:
r3---sn-bjvg2-1gie.c.youtube.com
canonical name = r3.sn-bjvg2-1gie.c.youtube.com.
Name: r3.sn-bjvg2-1gie.c.youtube.com
Address: 193.142.125.14
If the resulting address (193.142.125.14 in this example) is an address in the subnet allocated to the GGC
node, the video is playing from the cache.
Note: The base web pages of www.youtube.com may not be served from the cache.
You can use Firefox as well to perform this test, but you will need to install the Firebug extension.
There are several possible reasons why a video may not play from the cache.
8.5.1 The user’s DNS resolver is not in the BGP feed to the GGC node.
One of the mechanisms the mapping system uses to send requests to a node is DNS. The DNS request from the
user will go to your resolver, which will then query Google's authoritative name servers. If your resolver's IP
address is in a prefix that is being advertised to the node, the IP address returned to your DNS resolver (and then
to the user) should be from the GGC node subnet. To determine the resolver Google is seeing, execute the
following command from your test client:
nslookup -q=txt o-o.myaddr.l.google.com
Use o-o.myaddr.l.google.com verbatim; do not substitute the myaddr part. This is a special host name that will
return the IP address of the DNS resolver as seen by Google. You should see a response similar to:
Non-authoritative answer:
o-o.myaddr.l.google.com
text = "<IP_address>"
Confirm that IP_address is in the BGP feed to the GGC node. A common error is using a test resolver that
forwards requests to another DNS server whose IP address is not in the BGP feed.
The mapping file is updated periodically, so it takes some time before changes in the advertised BGP feed are
picked up by the mapping system. If the address was added to the BGP feed within the last 24 hours, please
contact [email protected] to confirm that the change has been pushed to our production servers.
8.5.2 The client’s IP is not in the BGP feed to the GGC node
If the requested video is not playing from the cache, it is possible that the BGP feed does not include the test
client’s IP address. If this is the case, the cache will get the request and then redirect it to a cache outside your
network. Verify that the test client’s IP address is in the feed and has been there for at least 1 hour.
If the requested video plays from the cache sometimes, but not every time, the cache may be overflowing. As the
cache reaches its configured serving capacity, it will begin redirecting requests to external caches. The serving
capacity of the cache is based on a combination of several factors:
• number of servers in the node
• number of interfaces connected on each server (LACP)
• available bandwidth provisioned between the cache and your network (reported in the ISP Portal)
• manually configured Egress Target
You can determine if the cache is overflowing by reviewing the 'Demand With Overflow' graph in the ISP Portal
(under 'Monitoring > Serving Overview').
If you suspect that the node should not be overflowing at the current traffic level, please review the Notifications
page: https://isp.google.com/notifications
Once all relevant notifications are addressed, if the problem persists, please contact us.
The cache will store the most popular videos your users are requesting. There is an admission mechanism that
can prevent a video from being cached on the first play. If this is the case, a second playback should come from
the cache.
During the installation process, the ‘Cache Node Status’ field in the ISP Portal, found under ‘Configure > GGC >
Node Information’ will provide information on the fulfillment, shipping, and turn-up status of the node.
The 'Monitor' menu in the ISP Portal gives you access to traffic and performance graphs for the GGC node(s).
8.7.1 Traffic
The 'Traffic' tab shows graphs for Egress, Ingress, and Demand With Overflow:
• Egress is traffic from the cache sent towards the users
• Ingress is cache fill traffic coming from Google's origin servers
• Demand With Overflow is discussed in section 8.5.3
8.7.2 Performance
The Packet Retransmits Graph (under the ‘Performance’ tab) shows transmit packet loss (measured by
retransmissions) from the node. If you are observing slow video playbacks or significant rebuffering, this graph
can tell you if the node is having difficulty reaching your users. High packet loss is most often caused by
congestion or faults in the access network between the cache and the users. The issue can sometimes be traced to
a network bottleneck, such as improper load balancing across aggregated links, a faulty circuit, or a
malfunctioning interface.