Very Large Database
A very large database (originally written very large data base), or VLDB,[1] is a database that contains
so much data that it can require specialized architectural, management, processing and maintenance
methodologies.[2][3][4][5]
Definition
The vague adjectives very and large allow for a broad and subjective interpretation, but attempts have
been made to define a metric and a threshold. Early metrics were the size of the database in a canonical
form obtained via database normalization, or the time taken by a full database operation such as a backup.
Technology improvements have continually changed what is considered very large.[6][7]
One definition suggests that a database becomes a VLDB when it is "too large to be maintained
within the window of opportunity… the time when the database is quiet".[8]
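The window-based definition above can be read as a simple arithmetic test. The following is a minimal sketch, assuming that maintenance means a single full backup running at a constant throughput; the function name, parameters and figures are hypothetical and chosen only for illustration.

```python
def exceeds_maintenance_window(size_gb: float,
                               throughput_gb_per_hour: float,
                               window_hours: float) -> bool:
    """Return True if a full backup would not finish within the quiet window.

    Assumes (for illustration only) a constant backup throughput.
    """
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    backup_hours = size_gb / throughput_gb_per_hour
    return backup_hours > window_hours


# Hypothetical figures: 500 GB/hour throughput and a 6-hour quiet window.
print(exceeds_maintenance_window(2_000, 500, 6))    # 4-hour backup fits
print(exceeds_maintenance_window(50_000, 500, 6))   # 100-hour backup does not
```

Under this reading the threshold is relative: faster infrastructure or a longer quiet period moves the point at which a database counts as very large.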
VLDB challenges
Key areas where a VLDB may present challenges include configuration, storage, performance,
maintenance, administration, availability and server resources.[11]: 11
Configuration
Databases in the VLDB realm need careful configuration to avoid or reduce the issues that their size
raises.[11]: 36–53 [12]
Administration
The complexities of managing a VLDB can increase exponentially for the database administrator as
database size increases.[13]
Best practice is for backup and recovery to be designed as part of the overall availability and business
continuity solution.[20][21]
Performance
Given the same infrastructure, performance typically decreases (that is, response time increases) as
database size increases. Some accesses simply have more data to process (scan), which takes
proportionally longer (linear time), while the indexes used to access data may grow slightly in height,
requiring perhaps an extra storage access to reach the data (sub-linear time).[22] Other effects include
caching becoming less efficient because proportionally less of the data can be cached and, while some
indexes such as the B+ tree cope well with growth automatically, others such as a hash table may need to
be rebuilt.
Should an increase in database size cause the number of accessors of the database to increase, more
server and network resources may be consumed and the risk of contention will increase. Some solutions for
regaining performance include partitioning, clustering (possibly with sharding), or use of a database
machine.[23]: 390 [24]
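The contrast between linear and sub-linear growth described above can be illustrated with a rough cost model. This is a sketch under simplifying assumptions, not a description of any particular database engine: every page is assumed to hold 100 rows, every index node is assumed to have a fan-out of 100, and the function names are hypothetical.

```python
import math


def full_scan_reads(rows: int, rows_per_page: int) -> int:
    """Pages read by a full scan: grows linearly with the row count."""
    return math.ceil(rows / rows_per_page)


def btree_height(rows: int, fanout: int) -> int:
    """Smallest number of index levels h such that fanout**h >= rows.

    Simplified model: every node is full and has the same fan-out.
    """
    height = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        height += 1
    return height


# Assumed figures: 100 rows per page and an index fan-out of 100.
for rows in (1_000_000, 100_000_000, 10_000_000_000):
    print(rows, full_scan_reads(rows, 100), btree_height(rows, 100))
```

In this model a 10,000-fold increase in rows multiplies the scan cost by 10,000 but adds only two levels to the index, which matches the "perhaps an extra storage access" behaviour described in the text.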
Partitioning
Partitioning may assist the performance of bulk operations on a VLDB, including backup and
recovery[25] and bulk movements due to information lifecycle management (ILM),[26]: 3 [27]: 105–118 reduce
contention,[27]: 327–329 and allow optimization of some query processing.[27]: 215–230
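One way partitioning can help query processing is by letting a query skip partitions that cannot contain matching rows. The sketch below assumes a table range-partitioned by calendar month; the partitioning scheme and the function name are illustrative assumptions rather than the behaviour of a specific product.

```python
from datetime import date


def partitions_for_range(start: date, end: date) -> list[tuple[int, int]]:
    """List the (year, month) partitions overlapping an inclusive date range.

    Assumes (for illustration) one partition per calendar month.
    """
    if end < start:
        raise ValueError("end must not precede start")
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


# A query over roughly two months touches three monthly partitions,
# however many years of data the whole table holds.
print(partitions_for_range(date(2023, 11, 15), date(2024, 1, 10)))
```

The same partition boundaries could also serve as units for bulk operations such as backing up or moving a single month of data, which is the ILM use mentioned above.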
Storage
To satisfy the needs of a VLDB, the database storage needs low access latency, low contention,
high throughput and high availability.
Server resources
The increasing size of a VLDB may put pressure on server and network resources, and a bottleneck may
appear that requires infrastructure investment to resolve.[13][28]
See also
XLDB
References
1. "Oracle Database Online Documentation 11g Release 1 (11.1) / Database Administration
Database Concepts" (https://docs.oracle.com/cd/B28359_01/server.111/b28318/partconc.ht
m#CNCPT011). oracle. 18 Very Large Databases (VLDB). Retrieved 3 October 2018.
2. "Very Large Database (VLDB)" (https://www.techopedia.com/definition/14731/very-large-database-vldb).
Techopedia. Archived (https://web.archive.org/web/20180704224849/https://www.techopedia.com/definition/14731/very-large-database-vldb)
from the original on 4 July 2018. Retrieved 3 October 2018.
3. Gaines, R. S. and R. Gammill. Very Large Data Bases: An Emerging Research Area,
Informal working paper, RAND Corporation
4. Data Processing Magazine (https://books.google.com/books?id=3JYgAAAAMAAJ). North
American Publishing Company. 1964. p. 18,58.
5. Widlake, Marin (18 September 2009). "What is a VLDB?" (https://mwidlake.wordpress.com/2
009/09/18/what-is-a-vldb/). mwidlake. Archived (https://web.archive.org/web/201810061147
29/https://mwidlake.wordpress.com/2009/09/18/what-is-a-vldb/) from the original on 6
October 2018. Retrieved 7 October 2018.
6. Sidley, Edgar H. (1 April 1980). Encyclopedia of Computer Science and Technology:
Volume 14 - Very Large Data Base Systems to Zero-Memory and Markov Information Source
(https://books.google.com/books?id=KUgNGCJB4agC). CRC Press. pp. 1–18.
ISBN 9780824722142.
7. Gerritsen, Rob; Morgan, Howard; Zisman, Michael (June 1977). "On some metrics for
databases or what is a very large database?" (https://doi.org/10.1145%2F984382.984393).
ACM SIGMOD Record. 9 (1): 50–74. doi:10.1145/984382.984393 (https://doi.org/10.1145%2
F984382.984393). ISSN 0163-5808 (https://www.worldcat.org/issn/0163-5808).
S2CID 6359244 (https://api.semanticscholar.org/CorpusID:6359244).
8. Rankins, Ray; Jensen, Paul; Bertucci, Paul (18 December 2002). "21" (https://archive.org/de
tails/microsoftsqlserv00rayr). Microsoft SQL Server 2000 (https://archive.org/details/microsoft
sqlserv00rayr) (2nd ed.). SAMS. ISBN 978-0672324673. Administering Very Large SQL
Server Databases.
9. "Oracle Database Release 18 - VLDB and Partitioning Guide" (https://docs.oracle.com/en/d
atabase/oracle/oracle-database/18/vldbg/partition-intro.html). Oracle. 1 Introduction to Very
Large Databases. Archived (https://web.archive.org/web/20181003205734/https://docs.oracl
e.com/en/database/oracle/oracle-database/18/vldbg/partition-intro.html) from the original on
3 October 2018. Retrieved 3 October 2018.
10. "The Very Large Database Problem - How to Backup & Recover 30–100 TB Databases" (htt
p://cdn2.hubspot.net/hubfs/214442/Actifio_For_Very_Large_Databases_White_Paper.pdf)
(PDF). actifio. Archived (https://web.archive.org/web/20180219182335/http://cdn2.hubspot.n
et/hubfs/214442/Actifio_For_Very_Large_Databases_White_Paper.pdf) (PDF) from the
original on 19 February 2018.
11. Hussain, Syed Jaffer (2014). "Tuning & Applying Best Practices On Very Large Databases
(VLDB)" (http://www.aioug.org/sangam14/images/Sangam14/Presentations/201461_Hussai
n_ppt.pdf) (PDF). Sangam: AIOUG. Archived (https://web.archive.org/web/20181004205048/
http://www.aioug.org/sangam14/images/Sangam14/Presentations/201461_Hussain_ppt.pdf)
(PDF) from the original on 4 October 2018.
12. Chaves, Warner (7 January 2015). "Top 10 Must-Do Items for your SQL Server Very Large
Database" (http://sqlturbo.com/top-10-must-do-items-for-your-sql-server-very-large-databas
e/). SQLTURBO. Archived (https://web.archive.org/web/20171213085742/http://sqlturbo.co
m/top-10-must-do-items-for-your-sql-server-very-large-database/) from the original on 13
December 2017. Retrieved 5 October 2018.
13. Furman, Dimitri (22 January 2018). Rajesh Setlem; Mike Weiner; Xiaochen Wu (eds.). "SQL
Server VLDB in Azure: DBA Tasks Made Simple" (https://blogs.msdn.microsoft.com/sqlcat/2
018/01/22/sql-server-vldb-in-azure-dba-tasks-made-simple/). MSDN. Archived (https://web.a
rchive.org/web/20181006072244/https://blogs.msdn.microsoft.com/sqlcat/2018/01/22/sql-ser
ver-vldb-in-azure-dba-tasks-made-simple/) from the original on 6 October 2018. Retrieved
6 October 2018.
14. "Specialized Requirements for Relational Data Warehouse Servers" (https://web.archive.or
g/web/19971010114605/http://www.redbrick.com/rbs-g/whitepapers/tenreq_wp.html). Red
Brick Systems, Inc. 21 June 1996. Archived from the original (http://www.redbrick.com/rbs-g/
whitepapers/tenreq_wp.html) on 10 October 1997.
15. "Cluster design considerations" (https://developer.couchbase.com/documentation/server/3.x/admin/Concepts/bp-clusterDesign.html).
Couchbase. Archived (https://web.archive.org/web/20181017195247/https://developer.couchbase.com/documentation/server/3.x/admin/Concepts/bp-clusterDesign.html)
from the original on 17 October 2018. Retrieved 17 October 2017.
16. "Cross Datacenter Replication (XDCR)" (https://developer.couchbase.com/documentation/server/3.x/admin/XDCR/xdcr-intro.html).
Couchbase. Archived (https://web.archive.org/web/20181017195516/https://developer.couchbase.com/documentation/server/3.x/admin/XDCR/xdcr-intro.html)
from the original on 17 October 2018. Retrieved 17 October 2017.
17. Chien, Tim. "Snapshots Are NOT Backups" (https://www.oracle.com/technetwork/database/a
vailability/rman-fra-snapshot-322251.html). Oracle technetwork. Archived (https://web.archiv
e.org/web/20180907091910/https://www.oracle.com/technetwork/database/availability/rman
-fra-snapshot-322251.html) from the original on 7 September 2018. Retrieved 10 October
2018.
18. "Using a split mirror as a backup image" (https://www.ibm.com/support/knowledgecenter/en/
SSEPGG_9.5.0/com.ibm.db2.luw.admin.ha.doc/doc/t0006423.html). IBM Knowledge
Center. Archived (https://archive.today/20180109160158/https://www.ibm.com/support/knowl
edgecenter/en/SSEPGG_9.5.0/com.ibm.db2.luw.admin.ha.doc/doc/t0006423.html) from the
original on 9 January 2018. Retrieved 10 October 2018.
19. "Chapter 1 High Availability and Scalability" (https://dev.mysql.com/doc/mysql-ha-scalability/
en/ha-overview.html). dev.mysql. Archived (https://web.archive.org/web/20161215030829/htt
ps://dev.mysql.com/doc/mysql-ha-scalability/en/ha-overview.html) from the original on 15
December 2016. Retrieved 12 October 2018.
20. Brooks, Charlotte; Leung, Clem; Mirza, Aslam; Neal, Curtis; Qiu, Yin Lei; Sing, John; Wong,
Francis TH; Wright, Ian R (March 2007). "Chapter 1. Three Business solution segments
defined". IBM System Storage Business Continuity: Part 2 Solutions Guide. IBM Redbooks.
ISBN 978-0738489728.
21. Akhtar, Ali Navid; Buchholtz, Jeff; Ryan, Michael; Setty, Kumar (2012). "Database Backup
and Recovery Best Practices" (https://www.isaca.org/Journal/archives/2012/Volume-1/Page
s/Database-Backup-and-Recovery-Best-Practices.aspx). Archived (https://web.archive.org/w
eb/20180629131442/https://www.isaca.org/Journal/archives/2012/Volume-1/Pages/Databas
e-Backup-and-Recovery-Best-Practices.aspx) from the original on 29 June 2018. Retrieved
12 October 2012.
22. Tariq, Ovais (14 July 2011). "Understanding B+tree Indexes and how they Impact
Performance" (http://www.ovaistariq.net/733/understanding-btree-indexes-and-how-they-imp
act-performance/). ovaistariq.net. Archived (https://web.archive.org/web/20180207203602/htt
p://www.ovaistariq.net/733/understanding-btree-indexes-and-how-they-impact-performance/)
from the original on 7 February 2018. Retrieved 10 October 2018.
23. Shrestha, Raju (2017). High Availability and Performance of Database in the Cloud -
Traditional Master-slave Replication versus Modern Cluster-based Solutions (https://www.re
searchgate.net/publication/317299391). 7th International Conference on Cloud Computing
and Services. Vol. 1: CLOSER. SCITEPRESS – Science and Technology Publications, Lda.
doi:10.5220/0006294604130420 (https://doi.org/10.5220%2F0006294604130420).
ISBN 978-989-758-243-1. Archived (https://web.archive.org/web/20181017152557/https://w
ww.researchgate.net/publication/317299391_High_Availability_and_Performance_of_Data
base_in_the_Cloud_-_Traditional_Master-slave_Replication_versus_Modern_Cluster-base
d_Solutions) from the original on 17 October 2018.
24. "Encyclopedia" (https://www.pcmag.com/encyclopedia/term/40879/database-machine).
Definition of: database machine. Archived (https://web.archive.org/web/20160704205410/htt
p://www.pcmag.com/encyclopedia/term/40879/database-machine) from the original on 4
July 2016. Retrieved 10 October 2018.
25. Burleson, Donald (26 March 2015). "Oracle Backup VLDB tips" (http://www.dba-oracle.com/t
_backup_vldb.htm). Burleson Consulting. Archived (https://web.archive.org/web/201706302
23240/http://www.dba-oracle.com/t_backup_vldb.htm) from the original on 30 June 2017.
Retrieved 11 October 2016.
26. "Oracle Partitioning in Oracle Database 12c Release 2 Extreme Data Management and
Performance for every System" (https://www.oracle.com/technetwork/database/options/partiti
oning/partitioning-wp-12c-1896137.pdf) (PDF). Oracle. March 2017. Archived (https://web.ar
chive.org/web/20171215074909/https://www.oracle.com/technetwork/database/options/partit
ioning/partitioning-wp-12c-1896137.pdf) (PDF) from the original on 15 December 2017.
Retrieved 17 October 2018.
27. Teske, Thomas (8 February 2018). Get the best out of Oracle Partitioning - A practical guide
and reference (https://indico.cern.ch/event/697301/attachments/1598206/2532649/Partitioni
ng_guide_v18.pdf) (PDF) (Speech). Cern. Hermann Bär. 40-S2-C01 - Salle Curie (CERN):
Oracle. Archived (https://web.archive.org/web/20181012172456/https://indico.cern.ch/event/
697301/attachments/1598206/2532649/Partitioning_guide_v18.pdf) (PDF) from the original
on 12 October 2018. Retrieved 12 October 2018.
28. Steel, Phil; Poggemeyer, Liza; Plett, Corey (1 August 2018). "Server Hardware Performance
Considerations" (https://docs.microsoft.com/en-us/windows-server/administration/performan
ce-tuning/hardware/). Microsoft IT Pro Center. Archived (https://web.archive.org/web/201810
17175544/https://docs.microsoft.com/en-us/windows-server/administration/performance-tuni
ng/hardware/) from the original on 17 October 2018. Retrieved 17 October 2018.
29. Li, Yishan; Manoharan, Sathiamoorthy (2013). A performance comparison of SQL and
NoSQL databases. 2013 IEEE Pacific Rim Conference on Communications, Computers and
Signal Processing (PACRIM). IEEE. p. 15. doi:10.1109/PACRIM.2013.6625441 (https://doi.o
rg/10.1109%2FPACRIM.2013.6625441). ISBN 978-1-4799-1501-9.